CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning
Li, Xiaoya, Sun, Xiaofei, Wang, Albert, Li, Jiwei, Shum, Chris
The exponential growth in demand for GPU computing resources has created an urgent need for automated CUDA optimization strategies. While recent advances in LLMs show promise for code generation, current SOTA models achieve low success rates in improving CUDA speed. In this paper, we introduce CUDA-L1, an automated reinforcement learning framework for CUDA optimization that employs a novel contrastive RL algorithm. CUDA-L1 achieves significant performance improvements on the CUDA optimization task: trained on A100, it delivers an average speedup of x3.12 with a median speedup of x1.42 against default baselines across all 250 CUDA kernels of KernelBench, with peak speedups reaching x120. In addition to the default baseline provided by KernelBench, CUDA-L1 demonstrates a x2.77 speedup over Torch Compile, x2.88 over Torch Compile with reduce-overhead, x2.81 over CUDA Graph implementations, and, remarkably, x7.72 over cuDNN libraries. Furthermore, the model also demonstrates portability across different GPU architectures. Beyond these benchmark results, CUDA-L1 demonstrates several properties: it 1) discovers a variety of CUDA optimization techniques and learns to combine them strategically to achieve optimal performance; 2) uncovers fundamental principles of CUDA optimization, such as the multiplicative nature of optimizations; and 3) identifies non-obvious performance bottlenecks and rejects seemingly beneficial optimizations that actually harm performance. These capabilities demonstrate that RL can transform an initially poor-performing LLM into an effective CUDA optimizer through speedup-based reward signals alone, without human expertise or domain knowledge. This paradigm opens possibilities for the automated optimization of CUDA operations, and holds promise to substantially promote GPU efficiency and alleviate the rising pressure on GPU computing resources.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
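The "multiplicative nature of optimizations" mentioned in the abstract can be illustrated with a toy model (the numbers and optimization names below are hypothetical, not results from the paper): independent optimizations each scale the remaining runtime by their own factor, so their speedups compose by multiplication rather than addition.

```python
from math import prod

def combined_speedup(speedups):
    """Combined speedup under the multiplicative model: each independent
    optimization divides the remaining runtime by its own factor."""
    return prod(speedups)

# Hypothetical example: memory coalescing (2.0x) combined with
# shared-memory tiling (1.5x) yields 3.0x under this model.
print(combined_speedup([2.0, 1.5]))  # -> 3.0
```

This is why stacking several modest kernel optimizations can produce the large end-to-end speedups the abstract reports.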
Deep Learning Model Security: Threats and Defenses
Wang, Tianyang, Bi, Ziqian, Zhang, Yichao, Liu, Ming, Hsieh, Weiche, Feng, Pohsun, Yan, Lawrence K. Q., Wen, Yizhu, Peng, Benji, Liu, Junyu, Chen, Keyu, Zhang, Sen, Li, Ming, Jiang, Chuanqi, Song, Xinyuan, Yang, Junjie, Jing, Bowen, Ren, Jintao, Song, Junhao, Tseng, Hong-Ming, Chen, Silin, Wang, Yunze, Liang, Chia Xin, Xu, Jiawei, Pan, Xuanhe, Wang, Jinlang, Niu, Qian
Deep learning has transformed AI applications but faces critical security challenges, including adversarial attacks, data poisoning, model theft, and privacy leakage. This survey examines these vulnerabilities, detailing their mechanisms and impact on model integrity and confidentiality. Practical implementations, including adversarial examples, label flipping, and backdoor attacks, are explored alongside defenses such as adversarial training, differential privacy, and federated learning, highlighting their strengths and limitations. Advanced methods like contrastive and self-supervised learning are presented for enhancing robustness. The survey concludes with future directions, emphasizing automated defenses, zero-trust architectures, and the security challenges of large AI models. A balanced approach to performance and security is essential for developing reliable deep learning systems.
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Hawaii (0.04)
- (10 more...)
- Overview (1.00)
- Workflow (0.94)
- Instructional Material (0.92)
- Research Report (0.81)
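The adversarial examples the survey discusses can be made concrete with a minimal sketch of the fast gradient sign method (FGSM), which perturbs an input in the direction of the sign of the loss gradient. The model below is a toy logistic-regression classifier with made-up weights, chosen only to keep the example self-contained; it is not an implementation from the survey.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, w, b, y, eps):
    """One-step FGSM against a logistic-regression classifier.
    For binary cross-entropy, the gradient of the loss w.r.t. the
    input is (p - y) * w, where p is the predicted probability."""
    p = sigmoid(x @ w + b)
    grad_x = (p - y) * w              # dL/dx
    return x + eps * np.sign(grad_x)  # move input to increase the loss

# Toy model and input (all values hypothetical).
w, b = np.array([2.0, -1.0]), 0.0
x, y = np.array([1.0, 1.0]), 1.0

x_adv = fgsm(x, w, b, y, eps=0.5)
# The perturbed input lowers the model's confidence in the true class.
print(sigmoid(x @ w + b), sigmoid(x_adv @ w + b))
```

Adversarial training, one of the defenses the survey covers, amounts to generating such perturbed inputs during training and fitting the model on them as well.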
Borch: A Deep Universal Probabilistic Programming Language
Belcher, Lewis, Gudmundsson, Johan, Green, Michael
The ability to solve a wide variety of challenging real-world problems using machine learning has flourished over the course of the past decade. We've seen advancements within diverse application areas, e.g., vision (Bojarski et al. 2016), natural language, and physics (Bakarji et al. 2022). We've also seen the emergence of a new paradigm for machine learning in which it is possible to teach a computer how to complete mathematical proofs (Davis 2021; Davies et al. 2021) and even compete in a real-world programming competition (Li et al. 2022). Despite the fact that most of these advances were achieved by neural networks, there are still areas where neural networks are far from superior to more traditional machine learning methods (Shwartz-Ziv and Armon 2021). The strength of many of these methods lies in the fact that they are easier to interpret and reason about.
- Europe > Denmark > Capital Region > Copenhagen (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Jordan (0.04)